Phylogenetically diverse Bradyrhizobium genospecies nodulate Bambara groundnut (Vigna subterranea L. Verdc) and soybean (Glycine max L. Merril) in the northern savanna zones of Ghana

Abstract A total of 102 bacterial strains isolated from nodules of three Bambara groundnut and one soybean cultivars grown in nineteen soil samples collected from northern Ghana were characterized using multilocus gene sequence analysis. Based on a concatenated sequence analysis (glnII-rpoB-recA-gyrB-atpD-dnaK), 54 representative strains were distributed in 12 distinct lineages, many of which were placed mainly in the Bradyrhizobium japonicum and Bradyrhizobium elkanii supergroups. Twenty-four of the 54 representative strains belonged to seven putative novel species, while 30 were conspecific with four recognized Bradyrhizobium species. The nodA phylogeny placed all the representative strains in the cosmopolitan nodA clade III. The strains were further separated in seven nodA subclusters with reference strains mainly of African origin. The nifH phylogeny was somewhat congruent with the nodA phylogeny, but both symbiotic genes were mostly incongruent with the core housekeeping gene phylogeny indicating that the strains acquired their symbiotic genes horizontally from distantly related Bradyrhizobium species. Using redundancy analysis, the distribution of genospecies was found to be influenced by the edaphic factors of the respective sampling sites. In general, these results mainly underscore the high genetic diversity of Bambara groundnut-nodulating bradyrhizobia in Ghanaian soils and suggest a possible vast resource of adapted inoculant strains.


Introduction
Bambara groundnut (Vigna subterranea L. Verdc) is an indigenous grain legume that is cultivated across the entire continent of Africa and also in some parts of southeast Asia (Feldman et al. 2019). In Africa, it is, after cowpea, regarded as the second most important indigenous legume cultivated by farmers on smallholdings (Abu andBuah 2011, Jaiswal and. Its annual world production is estimated at 330 000 tons, of which about half is produced in West Africa, mainly in Nigeria, Ghana, Togo andBenin (Directorate Plant Production 2011, Alhassan andEgbe 2013). In Ghana, Bambara is found growing in both the southern and northern parts of the country, but its production is more common in the northern savanna zones (Guinea and Sudan savanna agroecology) than in the southern regions (William et al. 2016). The crop does well in well-drained soils with a pH of 5.0-6.5 and thrives at an average temperature of 20-28 • C and a mean annual rainfall of 500-600 mm (Directorate Plant Production 2011). Bambara groundnut is drought tolerant and can produce reasonable yields in marginal, low-input soils where other crops may fail (Feldman et al. 2019). Generally, Bambara groundnut is cultivated as landraces and it is featured either as a monocrop or an intercrop for cereals or root crops. The matured seeds provide good nutritional value that can help enhance the nutritional status of farm families (Feldman et al. 2019). Although largely underutilized, the crop holds great potential to address food insecurity especially in areas impacted or likely to be impacted by climate change.
Bambara groundnut, like other legumes, forms a symbiotic association with nitrogen-fixing soil bacteria commonly called rhizobia to provide fixed nitrogen (N) via biological nitrogen fixation (BNF) (Mohale et al. 2014). The BNF process is one of the essential ecosystem services that plays a vital role in agricultural systems by increasing grain legume productivity and helping to improve the carbon (C) and N content of soils (Jensen and Hauggaard-Nielsen 2003). Particularly in smallholder farming systems where access to synthetic N fertilizer is limited due to high cost and inadequate supply, BNF is essential (Houlton et al. 2019). Bam-bara groundnut can fix as much as 98% of its N nutrition by forming a symbiotic association with rhizobia (Sprent et al. 2010, Mohale et al. 2014. The main symbionts of this crop belong to the genus Bradyrhizobium (Grönemeyer and Reinhold-Hurek 2018). This genus is geographically widespread and its species have been isolated from different leguminous plant species in several countries and from different edaphoclimatic conditions (Helene et al. 2020). Validly published Bradyrhizobium species such as B. pachyrhizi, B. vignae, B. yuanmingense, B. daqingense, B. huanghuaihaiense, B. iriomotense, B. jicamae, B. lablabi, B elkanii and B. subterraneum are known symbionts of Bambara groundnut (Grönemeyer et al. 2014, 2015b, Puozaa et al. 2017, Ibny et al. 2019. Although vital for increased food production in Africa, the diversity and distribution of these Bambara groundnut-associated microsymbionts have not been well studied. Thus, more studies are needed to uncover their diversity in African soils and their potential for an adapted inoculant production. In addition, strain identification and characterization would improve knowledge of their contribution to ecosystem functioning and thus provide information on factors that shape their diversity and distribution . This information would contribute toward the understanding of adaptational properties important to the suitability of a strain for use in inoculants at a particular geographical location (Grönemeyer and Reinhold-Hurek 2018).
Soybean plays an important role in the diets of many households due to its high nutritional value. The crop is cultivated under diverse climatic conditions worldwide and accounts for 80% of the land area used for legume production in the world (Herridge et al. 2008). Although native to northeast Asia, the crop is increasingly becoming an important cash crop in Africa with countries like South Africa, Zambia, Nigeria and Uganda being lead producers (Khojely et al. 2018). In Ghana, ∼80% of the crops' national production, estimated at 182 000 tons, takes place in the northern savanna zones (Aidoo et al. 2013). The crop is mainly cultivated for food and income and the haulms are used as feed for livestock. Soybean is capable of hosting diverse groups of rhizobia in root nodules to meet its large N requirement for optimum growth. Rhizobial species belonging to the genera Bradyrhizobium, Rhizobium, Mesorhizobium and Ensifer (Sinorhizobium) are capable of forming effective N 2 -fixing symbioses with soybean (Chen et al. 1995, Saldaña et al. 2003. Diverse Bradyrhizobium species, including B. diazoefficiens (Delamuta et al. 2013), B. daqingense (Wang et al. 2013a), B. elkanii (Kuykendall et al. 1992), B. huanghuaihaiense , B. japonicum (Jordan 1982), B. liaoningense (Xu et al. 1995) and B. ottawaense (Yu et al. 2014), have been isolated from soybean nodules. Previously, soybean was regarded as fairly specific with regard to its requirement for particular rhizobia, but breeding programs have tackled this problem with the release of improved varieties that are promiscuous and can nodulate with indigenous rhizobial population (Abaidoo et al. 2000, Mpepereki et al. 2000. Despite the increasing cultivation of soybean in Ghana, very little information is available on the diversity of indigenous rhizobia that nodulate this legume in Ghanaian soils. Thus, to guide the exploration of the indigenous rhizobial population for strains suitable to soybean inoculant production, studies on the diversity of soybean-nodulating species in Ghanaian soils are needed. Molecular and genetic methods have been used to effectively study the taxonomy and diversity of rhizobia (Jaiswal et al. 2016). For identification of rhizobial genotypes, protein-coding housekeeping genes, such as recA, have been demonstrated by several authors as a good phylogenetic marker (Aserse et al. 2012a, Jiao et al. 2015, Mousavi et al. 2015, Tena et al. 2017, Asfaw et al. 2020). However, individual housekeeping genes may have different evo-lutionary rates or histories among bacterial taxa and hence the phylogeny of a single gene may not provide a perfect snapshot of the evolution of a species (Peter et al. 1996;Tian et al. 2012). Thus, a phylogeny based on multilocus sequence analysis (MLSA) involving several housekeeping genes can provide a more accurate reflection of diversity and taxonomy of rhizobial species (Glaeser and Kämpfer 2015).
We aimed to obtain a general understanding of the presence, taxonomy, ecological distribution and diversity of Bambara groundnut-and soybean-nodulating rhizobia in the soils of the northern savanna zones of Ghana. To achieve this, we isolated rhizobia from nodules of Bambara groundnut and soybean sown in potted soils sampled from the Guinea and Sudan savanna agroecological zones of Ghana. A total of 102 rhizobial strains were recovered from the trapping experiment and these were screened based on phylogenetic analysis of the recA gene. Primarily, considering isolation sites and phylogenetic groups, 54 strains representing diverse clusters were then used in MLSA of six proteincoding housekeeping genes, including genes coding for recombination protein (recA), glutamine synthetase II (glnII), RNA polymerase beta subunit (rpoB), DNA gyrase subunit B (gyrB), ATP synthase subunit beta (atpD) and the 70kDa chaperone (dnaK). To further refine adaptational properties, the phylogeny of two accessory genes, nodA encoding N-acyltransferase required for nodulation and nifH that encodes dinitrogenase reductase required for N 2 , was determined. The nodulation capacity of the representative strains was studied in a growth chamber using Bambara groundnut as trap crop. The physicochemical properties of the sampling sites were also examined to assess their influence on the distribution and composition of the genospecies.

Soil sampling and characterization
The isolates used in the study were trapped from soils sampled from nineteen farmers' fields located in the Guinea and Sudan savanna agroecological zones of Ghana. These agroecological zones are warm, semi-arid environments with a unimodal rainfall pattern that occurs between May and October. The vegetation consist of short deciduous fire-resistant trees that include anthropic species with open canopies, such as neem (Azadirachta indica), shea (Vitellaria paradoxa), mango (Mangifera indica), baobab (Adansonia digitata), silk-cotton (Ceiba petandra), African mahogany (Khaya senegalensis) and locust bean (Parkia biglobosa) trees. The ground flora comprises various grass species of varying height. The fields had a cropping history of maize cultivation and the soils according to the World Reference Base classification system (IUSS Working Group WRB 2015) represent seven soil types (Table S1, Supporting Information). The soil samples were analyzed for physicochemical characteristics mainly according to Van Reeuwijk (2002). Soil pH was measured in 1:2.5 soil/water (w/v) suspension. Organic carbon (C) was determined by the Walkley and Black wet combustion method and N by the Kjeldahl method. Plant-available P was extracted by the Bray 1 method (0.03 M NH 4 F-0.025 M HCl) and measured with a spectrophotometer using the molybdate blue method. Exchangeable Ca, Mg, K, Na, Mn and Zn were extracted with 1.0 M ammonium acetate buffered at pH 7 and exchangeable acidity was determined by saturating samples with unbuffered 1.0 M KCl solution titrated with 0.02 M NaOH. Results were expressed as cmol (+) kg -1 . Mn, Zn, Ca and Mg were measured with an atomic absorption spectrophotometer and K and Na, with a flame photometer. Because the soils were nonsaline, the effective cation exchange capacity (CEC ef ) was calculated as the sum of exchangeable Ca, Mg, K and Na, and the base saturation (BS) was calculated as BS = 100 × (Ca+Mg+K+Na)/CEC ef , respectively. Particle size distribution was determined with the hydrometer method.

Trapping, isolation and authentication of rhizobia nodulation ability
For the trapping experiment, seeds of three Bambara groundnut landraces with different seed coat colors (wine, cream and black) and one improved variety of soybean (Favor) were obtained, respectively, from markets in northern Ghana and the Council for Scientific and Industrial Research-Savanna Agricultural Research Institute (CSIR-SARI), Tamale, Ghana. Four-liter capacity pots were filled with 3 kg of soil and arranged in a screen house at CSIR-SARI. The seeds of each genotype were separately surface sterilized with 70% ethanol for 1 min, followed by a treatment with 3% sodium hypochlorite for 4 min and then rinsed thoroughly in several changes of sterile distilled water (Somasegaran and Hoben 2012) and planted singly in each soil. The seedlings were supplied with water every other day or when required. Nodules were sampled after 45 days of plant growth; at ∼50% flowering stage, washed and stored temporally on silica gel. The desiccated nodules were rehydrated in sterile distilled water for about an hour after which they were surface sterilized first using 70% ethanol for 1 min, then 3% sodium hypochlorite for ∼3 min where after the nodules were rinsed in several changes of sterile distilled water. Each surface sterilized nodule was crushed in 100 μl sterile distilled water after which 50 μl was spread on yeast mannitol agar (YMA) plates containing Congo red (Somasegaran and Hoben 2012). The plates were incubated at 28-30 • C for up to 10 days depending on the growth habit of each isolate. Colonies typical of rhizobia were repeatedly streaked (Somasegaran and Hoben 2012) until pure cultures were obtained. The pure cultures were then stored in yeast extract mannitol (YEM) broth containing 25% (v/v) glycerol at −20 • C.
The nodulation ability of 54 representative bacterial isolates (52 isolates from Bambara groundnut and two isolates from soybean) was tested using the Bambara groundnut landraces used in the trapping experiment under controlled conditions in a growth chamber at the University of Helsinki. The Bambara groundnut was selected for this screening because of its promiscuous nodulation pattern. The seeds were surface sterilized as previously described (Somasegaran and Hoben 2012) and germinated on 0.75% (w/v) water agar in Petri dishes. The plates were incubated at 28 • C until radicles were ∼2 cm long and then the seedlings were transferred aseptically into plastic growth pouches measuring 16.5 cm by 35 cm (CYG seed germination pouch; Mega International), containing sterile N-free nutrient solution (Jensen 1942) and arranged on holders. Bacterial cultures grown to log phase (1 ml) were inoculated on the base of the seedlings after 3 days and then grown in a growth chamber with a 14 h light period (22 • C for 1 h, 25 • C for 12 h and 2 • C for 1 h) and a 10 h dark period at 18 • C. The plants were watered with quarter-strength N-free nutrient solution once a week or when necessary, with sterile Milli-Q (Helsinki, Finland) water.. The plants were harvested after 35 days for root nodule observation. The nodulation was recorded by the presence of nodules and their N 2 -fixation efficiency was estimated by a pink or red coloration inside the nodules.

DNA extraction, PCR amplification and sequencing
Extraction of genomic DNA from bacterial cultures grown in YEM until late log phase was carried out using NucleoSpin Ge-nomic DNA extraction kit following the manufacturer's instruction (MACHEREY-NAGEL, Inc.). The integrity and purity of the DNA were checked using 1.5% (w/v) agarose gel electrophoresis. The genomic DNA of the isolates was stored at −20 • C. The gene encoding recombinase A (recA) was amplified using the primers TSrecAF (CAACTGCMYTGCGTATCGTCGAAGG) and TSrecAR (CG-GATCTGGTTGATGAAGATCACCATG) (Stępkowski et al. 2005). The polymerase chain reaction (PCR) reaction mixture was formulated according to the instruction of the manufacturer (Finnzymes): a 50 μl reaction mixture containing 1 μl DNA template, 5× Phusion HF buffer (1×), dNTPs (200 μM), primers (0.5 μM each) and Phusion DNA polymerase (0.01 U/μl). The PCR reaction was done at an initial denaturation of 98 • C for 3 min followed by 35 cycles of denaturation at 98 • C for 15 s, annealing at 68 • C for 30 s and elongation at 72 • C for 30 s. A final elongation was then done at 72 • C for 10 min and the samples were held temporarily at 4 • C. The quality of the PCR products was checked using 1.5% (w/v) agarose gel electrophoresis after which they were sequenced at the Institute of Biotechnology, University of Helsinki by Sanger technology using same primers used for the PCR amplification.

Sequence data analyses
The quality of the recA sequences was checked using the Gap4 program as implemented in the Staden package 2.0.0 (Staden et al. 2000). The sequences of the representative isolates were compared with sequences of reference strains in the National Center for Biotechnology Information (NCBI) GenBank database using the standard nucleotide Basic Local Alignment Search Tool (BLASTn) program (https://blast.ncbi.nlm.nih.gov/Blast.cgi; Altschul et al. 1990). Sequences of closely related reference strains were aligned with the sequences of the 102 rhizobial isolates using CLUSTAL W in MEGAX (Larkin et al. 2007). ModelFinder (Kalyaanamoorthy et al. 2017) embedded in IQ-TREE was used to determine a best-fit model according to the Bayesian Information Criterion (BIC) and the recA phylogeny was inferred using the maximum likelihood (ML) method (Nguyen et al. 2015). Branch support was assessed with the SH-like approximate likelihood ratio test (SH-aLRT) (Guindon et al. 2010) and ultrafast (UF) bootstrap integrated in IQ-TREE using 1000 replicates (Hoang et al. 2018). A cluster/lineage was considered probable when both SH-aLRT and UF bootstrap values are ≥80% and ≥95%, respectively. The pairwise sequence similarity between the references and representative strains was determined using the BLASTn program (align two or more sequences) (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=b lastn&BLAST_SPEC=GeoBlast&PAGE_TYPE=BlastSearch). For this sequence similarity analyses, we used the sequence used for making the alignments and for phylogenetic trees.
Based on the obtained recA phylogenetic tree, 54 representative strains were selected from the different clusters mainly considering isolation sites and phylogenetic group. The selected strains were then paired-end sequenced (2 × 150 bp) using Illumina NextSeq500 platform at the Institute of Biotechnology, University of Helsinki. For the purpose of this study, the reads were aligned with reference sequences in the GenBank and then additional core (glnII, gyrB, rpoB, atpD and dnaK) and symbiotic (nodA and nifH) genes were retrieved for further phylogenetic analyses. The individual gene sequences of the representative strains together with all available relevant reference strains retrieved from the NCBI GenBank database were aligned using CLUSTAL W. The six housekeeping genes were concatenated using Sequence Matrix (Vaidya et al. 2011). A best-fit model according to BIC was determined by the ModelFinder for all single gene alignments and the edge-proportional partition model (Chernomor et al. 2016) was used for the concatenated gene alignment. Phylogenetic trees were constructed from the single gene alignments and the six concatenated housekeeping gene sequences (recA-glnII-gyrB-atpD-rpoB-dnaK) following a similar procedure as described above. The recA, glnII, rpoB, gyrB, atpD, dnaK, nodA and nifH gene sequences were subsequently submitted to the NCBI GenBank database where they received the accession numbers MW798381-MW798434, MW798435-MW798487, MW798488-MW798541, MW798542-MW798595, MW798596-MW798649, MW798650-MW798703, MW853620-MW853661 and MW867251-MW867298, respectively. The accession numbers of the reference sequences are provided in Table S4 (Supporting  Information).

Statistical analysis
Redundancy analysis (RDA) was used to examine the relationship between soil factors and the distribution of genospecies and recA clusters (Table S3, Supporting Information) identified in the study using the R program. The soil variables were first transformed (log X + 1) and normalized to overcome the different measurement units (Clarke and Gorley 2001). Multivariate analyses of variance (MANOVA) was performed to determine the presence of significant differences between soil properties of the sampling sites (Bates 2005). The soil variables that explained variations significantly between genospecies and recA clusters were determined using the 'ordistep' (forward selection) method embedded in the 'vegan' package of the R software. The significance level for assessing differences was set at 5% and a permutation test (permutation = 999) was done. To avoid multicollinearity effects, all highly correlated or repetitive variables were removed but one and the fitted model was selected accordingly to visualize the resulting RDA plot (Bates 2005, Borcard et al. 2011.

Soil properties of the sampling sites
The physicochemical properties of the soils are presented in Table 1. Sand was the dominant fraction by 50-69% in all study soils, and the textural classes comprised sandy loam (11 soils, moderately coarse-textured), sandy clay loam (six soils, moderately finetextured) and loamy sand (two soils, coarse-textured). The soils were acidic, the pH (H 2 O) covering a range from 4.5 to 6.2, where four soils had a pH below 5.0 and only one had a pH above 6. The low pH indicates that the soils were noncalcareous. Organic carbon content was rather low, ranging from 0.4% to 1.8%, mean 0.8%. The CEC ef varied from 3.8 to 18.2 cmol (+) kg -1 with 16 soils having a CEC ef <10 cmol (+) kg -1 . The CEC ef correlated closely with C content (r = 0.87, P < 0.001), suggesting that organic matter probably provided much of the CEC sites, while the correlation between CEC ef and clay content was lower (r = 0.57, P = 0.0135). When calculated on the basis of CEC ef , and using the assumption that organic matter ( = 1.724 x organic C) has a CEC of 200 cmol kg -1 , clay had an average CEC of 26 cmol kg -1 (range 9-42 cmol kg -1 ). Even though a few soils may also have contained some high-activity clays, the soils were mostly dominated by low-activity clay minerals, likely kaolinite. This conclusion is also supported by the soil classification. Electrical conductivity was not determined but the very low concentration of Na in most soils suggests that the soils were nonsaline and non-sodic. As usual, exchangeable Al correlated closely (r = −0.91, P < 0.0001) with soil pH. The most acidic soils (Manga, Upando, Golinga and Kunfabiala) had a base saturation between 64% and 78%, calculated from the CEC ef , suggesting a chance of a moderate Al toxicity. In the other soils, the base saturation was >83% with a low probability of Al toxicity. The variation of plant-available P covered a range from very low (0-3 mg kg -1 ) to high (>20 mg kg -1 ) concentrations, most soils falling into the low (3-11 mg kg -1 ) category, and only one soil (Golinga) was in a very low range (<3 mg kg -1 ). On the other hand, all the three soils, Bamahu, Tanina and Gunayili, fell in the high category with P concentrations <30 mg kg -1 , thus being not excessively rich in plant-available P (Table 1).

Bacterial isolates
A total of 102 bacterial isolates sampled from 19 locations in the Guinea and Sudan savanna agroecological zones of Ghana were recovered from root nodules of three Bambara groundnut landraces: 35 isolates from the wine seed-coated race, 28 isolates from the cream seed-coated race and 32 isolates from the black seed-coated race, while the remaining seven isolates associated with the soybean variety, Favor. All the isolates obtained showed growth characteristic of Bradyrhizobium; they were slow growing and showed an alkaline reaction on YMA plates containing bromothymol blue. Of the 54 representative isolates selected for authentication, 44 were able to effectively nodulate Bambara groundnut. The distribution of the isolates with respect to the sampling sites and information on nodulation are shown in Table S1 (Supporting Information) and Table 2, respectively.

Identification and single gene phylogenies
Based on the recA gene sequence (375 bp) comparisons, all 102 isolates were identified as Bradyrhizobium spp. These strains were separated into thirteen phylogenetic clusters designated as clusters I-XIII (Fig. 1) and shared a sequence similarity of ≥96% within each group (data not shown). The clusters II, III, IV, VIII, XII and XIII were major clusters that contained between 5 and 28 strains. Clusters I, V, VI, VII, IX, X and XI were minor clusters with only 1-3 strains. Strains in cluster I grouped closely with the type strains of B. yuanmingense, B. vignae, B. cajani, B. centrosematis and/or B. guangxiense, respectively, and shared a sequence similarity of 96-100% with their respective reference strains. Strains in clusters II, III, IV, VII, VIII, IX, X, XII and XIII showed <96% sequence similarities with reference species and did not group with any described species. The phylogeny of 54 strains representing each recA cluster was further studied by the sequence analysis of glnII (423 bp), gyrB (419 bp), rpoB (435 bp), atpD (426 bp) and dnaK (246 bp). The phylogenies of the individual housekeeping genes are presented in Figs S1-S6 (Supporting Information). We reconstructed recA phylogeny with only 54 strains to compare the phylogenetic groups with other single gene phylogenies. The 13 clusters (I-XIII) obtained in Fig. 1 were also consistently recovered in the new recA phylogeny (Fig. S6, Supporting Information). The strains in the gyrB and atpD trees were grouped into 14 clusters each (Figs S1 and S2, Supporting Information). The dnaK, rpoB and glnII phylogenetic trees separated the strains into 9, 11 and 13 clusters, respectively (Figs S3-S5, Supporting Information). While strains (e.g. strains from recA clusters VI, VIII, IX, X, XI, XII, XIII) consistently clustered together across the single gene phylogenetic trees, other groups are also mostly consitent but showed aberrant clustering among strains and with reference strains only in one or two gene trees (e.g. strains from the recA clusters I, II, IV. V, VII) ( Table 2; Figs S1-S6, Supporting Information).

MLSA phylogeny
To further refine the taxonomic position of the representative strains, a concatenated phylogenetic analysis based on six protein-coding housekeeping genes (recA-glnII-gyrB-rpoB-dnaK-atpD) was done. The concatenated sequence contained 2127 analyzed sites of which 1182 sites were conserved, 766 were distinct patterns and 783 were parsimony-informative (Table S2, Supporting Information). Considering phylogenetic groupings as an evolutionary parameter, the concatenated phylogeny confirmed the presence of diverse and novel type Bradyrhizobium species separated in twelve clusters designated as genospecies I-XII (Fig. 2). Most of the 54 representative strains clustered in the two well-known supergroups, B. japonicum (genospecies I-VII and XII) and B. elkanii (genospecies VIII and IX). The strains that formed genospecies X and XI however, clustered in a well-supported distinct branch (99/100). The information on the taxonomic affiliation of the genospecies is presented in Table 2. Genospecies I, II, VII, VIII, IX and XII were identified as major clusters containing 4-16 strains that were isolated from 2-10 locations. While four of the genospecies (I, IV, V and VII) identified in this study belonged to recognized Bradyrhizobium species, eight (II, III, VI, VIII, IX, X, XI and XII) were unique; forming separate clusters without including known close references. Some of the strains grouped under genospecies I formed aberrant clusters in the single gene phylogenies but were grouped with B. yuanmingense (Branch sup-port (BS) = 99/99) in the concatenated phylogenetic tree. These strains were widely distributed in ten of the isolation sites of the study. The strains that constituted genospecies II did not cluster with reference strains in the concatenated phylogenetic tree. Strains BGS3B6, BGW7B5, BGN2P8 and BGW20C2 stood alone and did not cluster with known reference strains in the combined gene tree but showed distant relationship with B. huanghuaihaiense (95% sequence similarity), B. arachidis (95.9% sequence similarity) and B. denitrificans (91.3-92% sequence similarity), respectively. In genospecies IV, a close relationship was observed between B. guangxiense and strains BGN1C6 and BGW8P7 (BS = 99.8/100) that were isolated from two distinct locations in the Guinea savanna agroecological zone. The closest related species of genospecies V was B. arachidis, sharing a concatenated sequence similarity of 96.1%. Members of genospecies VII consistently clustered with B. vignae in all single gene trees and also in the concatenated phylogeny (BS = 100/100), sharing a sequence similarity range of 99.2-99.8% with the reference strain. The eleven strains included in genospecies VII were distributed across eight sampling sites (Table  S3, Supporting Information). Genospecies VIII harbored only nine potentially unique strains from this study, sharing a sequence similarity of 95-95.3% in the concatenated phylogeny with the closest species B. embrapense. The cluster formed by strains in genospecies IX did not include known references and were far from the closest reference B. mercantei (95.4-95.9% sequence similarity). Genospecies XII was constituted by only four potentially unique strains identified in this study. These strains shared a sequence similarity range of 96.1-96.5% with the closest reference, B. yuanmingense. In general, while nine of the soils used in the study hosted both known and unnamed genospecies, four other soils harbored only strains identified as potentially unique species (Table S3, Supporting Information).

Symbiotic phylogenies
To determine the phylogenies of symbiotic genes, the nucleotide sequences of the nodA (465 bp) and nifH (559 bp) genes were obtained for 43 and 48 representative strains, respectively. The phylogenies of the nodA and nifH genes were examined by including relevant Bradyrhizobium reference sequences available from GenBank (Figs 3 and 4). Based on the nodA phylogenetic tree, all the representative strains were found to belong to nodA clade III.3 (Moulin et al. 2004, Stępkowski et al. 2005. The strains from this study were further delineated into seven subclusters within nodA Clade III.3. In Clade III.3a, 18 representative strains from four genospecies (I, II, VII and IX) grouped with the type strain of B. vignae (BS = 97/100) (Grönemeyer et al. 2016) originally isolated from Vigna unguiculata in Namibia. All strains from genospecies II (excluding strains SWLF2 and BGE11C3) clustered together in nodA Clade III.3b and grouped with Bradyrhizobium sp. ORS1810 (BS = 100/100) (Moulin et al. 2004) isolated from Crotalaria lathyroides in Senegal. Strain SWLF2, however, stood alone without clustering with any reference strain in Clade III.3c. The nodA Clade III.3d included B. japonicum strain H2 (BS = 81/100) (Risal et al. 2012) originally isolated from Vigna radiata in Nepal together with two Bradyrhizobium strains from genospecies I. Clade III.3e harbored five strains from two genospecies (I and XII) and clustered with the non-photosynthetic Bradyrhizobium sp. ORS305 (BS = 100/100) (Nzoué et al. 2009) isolated from Aeschynomene schimperi growing in Senegal. Clade III.3f was represented by three strains from Genospecies IX and grouped with Bradyrhizobium sp. ORS130 (Dupuy et al. 1994) originally isolated from Faidherbia albida growing in Senegal. The nodA Clade III.3g was occupied by nine strains from genospecies VIII and included Bradyrhizobium sp. ORS336 (BS = 99/100) (Molouba et al. 1999) isolated from Aeschynomene afraspera growing in Senegal.
The nifH of the representative strains were separated in seven distinct clusters identified as groups 1-8 (Fig. 4). Branches formed by the strains in the nifH groups 1-4 were mostly similar to those formed by nodA Clades III.3a, III.3b, III.3d and III.3e. Groups 1 and 2 did not cluster with any reference strain but in groups 3 and 4, the representative strains clustered closely with B. arachidis (BS = 93/99) and B. vignae (BS = 98/99), respectively. Strains within genospecies VIII and IX were separated into two clusters in the nodA (Clades III.3f and III.3g) phylogenetic tree but formed a single cluster in the nifH phylogeny (group 5). nifH groups 6, 7 and 8 did not include any reference strains and were represented by strains identified in this study as genospecies IV, X and XI, respectively. Generally, the nodA and nifH phylogenies of the representative strains were mostly not congruent with the concatenated phylogenetic tree.

Effect of soil physicochemical factors on the distribution of genospecies and recA groups
The MANOVA test result revealed that the soil variables were significantly different (P < 0.001, data not shown) among recA clusters, and between the genospecies (Table 1; Table S3, Supporting Information). In the RDA result presented in Fig. 5, the first two axes explained most of the variation (cumulative 40%) presented between the recA clusters across the sampling sites and thus, the two axes in the plot represented the data well. The soil pH, exchangeable acidity (Al), Mn, K, and among the textural fractions sand (Sa) and silt (Si) appeared to be the main factors affecting the distribution of the recA clusters. The distribution of recA clusters rAIV and rAVI tended to follow the soil pH and sand contents. Soil exchangeable Al and Si exhibited a positive influence on the distribution of recA clusters rAVIII, rAXII and rAXIII. The soil Mn and K showed to have a positive effect on the distribution of rAII, but negatively affected the distribution of rAII, rAVIII and rAXIII. Soil P also had a negative effect on the distribution of rAII, rAVIII, rAXII and rAXIII. As shown in Fig. 6, in the RDA plot the first two axes explained 45% of the variations present between the genospecies across sampling sites. The soil pH, Mn and Al appeared to have the strongest effect on the distribution of the genospecies, followed by the silt fraction of the soil and the soil P content. The distribution of GSI and GSII appeared to be influenced positively by soil pH and negatively by clay content. The soil Mn and Al had a positive effect on distribution of the test strains that formed GSVII and GSVIII, respectively. Although the clay and Si tended to have an influence on the distribution of GSIX, the genospecies held a position closer to the center of the plot, suggesting that the contribution of the soil variables to the distribution of this genospecies was small.

Discussion
The BNF process contributes greatly to the environment and to agricultural sustainability. Its success during the introduction of new legume crops largely depends on the selection of highly effective rhizobial strains, preferable from the native population. These populations, however, contain diverse rhizobial species with varied effectiveness. Thus, an understanding of the diversity and taxonomy of native rhizobial species is a crucial step toward improving legume productivity using BNF. We analyzed rhizobial isolates recovered from root nodules of Bambara groundnut and soybean grown in soils sampled from the northern savanna zones of Ghana to determine their diversity and taxonomic affiliation. The northern savanna zones are characterized by harsh climates and heterogeneous soils that are generally low in plant nutrients (Bationo et al. 2018). Our sampling sites had no history of inoculation and therefore the rhizobia recovered in this study represent an indigenous population that is well adapted to the local environmental conditions.
By combining the phylogenetic information of several proteincoding housekeeping genes, MLSA allows adequate classification of closely related rhizobial species (Mousavi et al. 2015, Grönemeyer et al. 2016, Chidebe et al. 2018. This reduces the possible impact of horizontal gene transfers (HGTs) that could affect single gene analysis (Gevers et al. 2005, Martens et al. 2008. At least four to five genes should be considered for MLSA (Gevers et al. 2005). In our study, we used six housekeeping genes (recA, glnII, dnaK, atpD, gyrB and rpoB) to generate phylogenetic trees and subsequently establish phylogenies. Considering mainly the phylogenetic groupings of the strains" our MLSA results revealed the presence of di-verse groups of Bradyrhizobium as predominant symbionts of Bambara groundnut and soybean. According to Martínez-Romero and Caballero-Mellado (1996), rhizobial diversity is higher in centers of host plant origins than in environments where the host has been introduced. Correspondingly, Andronov et al. (2003) and Österman et al. (2011) reported that the diversity of Neorhizobium galegae sv. orientalis is higher in the Caucasus that is the center of origin for its host, Galega orientalis, than the diversity of N. galegae sv. officinalis is in this area. Our study revealed diverse indigenous Bradyrhizobium species as nodule symbionts of Bambara groundnut in Africa, the center of origin for this legume (Grönemeyer et al. 2014, Puozaa et al. 2017, Ibny et al. 2019. Also, its distinct and diverse genospecies identified as symbionts confirm the promiscuity of this legume (Grönemeyer et al. 2014, Puozaa et al. 2017, Ibny et al. 2019. According to the phylogenetic analysis of the concatenated sequences, many of the representative Bradyrhizobium strains were classified into two main lineages, the B. japonicum and B. elkanii supergroup. The B. japonicum supergroup is reported to have a much broader geographical distribution than the B. elkanii supergroup (Vinuesa et al. 2008, Stępkowski et al. 2018, and many of the strains known to be nodule symbionts of the diverse leguminous crops cultivated in Africa are affiliated to species of this supergroup . For instance, B. kavangense (Grönemeyer et al. 2015a) and B. daquingense (Steenkamp et al. 2008) were identified as microsymbionts of cowpea in soils of Southern Africa (Angola and Namibia) and Botswana, respectively. In Namibian and Ghanaian soils, respectively, strains of B. subterraneum (Grönemeyer et al. 2015b) and B. vignae (Puozaa et al. 2017) have been isolated from nodules of Bambara groundnut. Likewise, B. japonicum and B. yuamingense have been found associating with soybean and groundnut, respectively, in Mozambique (Chibeba et al. 2017) and Ghana (Osei et al. 2018), and many of the strains known to be nodule symbionts of the diverse leguminous crops cultivated in Africa are affiliated to species of this supergroup (Jaiswal and Dakora 2019). For instance, B. kavangense (Grönemeyer et al. 2015a) and B. daquingense (Steenkamp et al. 2008) were identified as microsym-  bionts of cowpea in soils of Southern Africa (Angola and Namibia) and Botswana, respectively. In Namibian and Ghanaian soils, respectively, strains of B. subterraneum (Grönemeyer et al. 2015b) and B. vignae (Puozaa et al. 2017) have been isolated from nodules of Bambara groundnut. Likewise, B. japonicum and B. yuamingense have been found associating with soybean and groundnut, respectively, in Mozambique and Ghana. In our study also the members that constituted the B. japonicum lineage were found in nearly 90% of the isolation sites, relative to the B. elkanii supergroup that was only limited to seven sampling sites.
Of the twelve genospecies we identified, four of the genospecies were conspecific with recognized Bradyrhizobium species in the B. japonicum supergroup. The remaining strains (except BGN2P8 and BGW20C2) that were distributed in both the B. japonicum and B. elkanii supergroups, appeared unique and separated from all described species. These unique genospecies could most likely be novel Bradyrhizobium species whose identities can be justified through further taxonomic analysis and by applying a set of criteria designed for bacterial species description (De Lajudie et al. 2019). Also, strains of the potentially unique groups were widespread in soils of twelve locations, with the soils of four locations only containing strains of the unique groups identified in this study. Our results corroborate various diversity studies conducted in Africa that have shown that novel Bradyrhizobium species are dominant and widespread in many African soils (Steenkamp et al. 2008, Aserse et al. 2012b, Grönemeyer et al. 2014, Puozaa et al. 2017, Mohammed et al. 2018). Thus, this calls for continuous exploration of new geographical areas to uncover the full diversity that may provide a vast resource for development of adapted inoculants that can greatly profit regions where smallholder farming is widespread. The species B. pachyrhizi, B. vignae, B. yuanmingense, B. daqingense and B. subterraneum have been reported as nodule symbionts of Bambara groundnut in African soils (Grönemeyer et al. 2015b, Puozaa et al. 2017, Ibny et al. 2019. In this study also, strains from genospecies I and VII were related to B. yuanmingense and B. vignae, respectively. Bradyrhizobium yuanmingense, originally recovered from Lespedeza cuneata, a native tree grown in China (Yao et al. 2002) has a broad distribution and host range. This species has been isolated from diverse geographical regions, e.g. Mexico (Vinuesa et al. 2008), Botswana (Steenkamp et al. 2008), Peru (Ormeno-Orrillo et al. 2006, Ghana (Osei et al. 2018), ) and Mozambique (Chidebe et al. 2018) and from nodules of legumes such as cowpea, groundnut and lima bean. Bradyrhizobium vignae however, is mainly distributed in Africa from countries such as Senegal (Li et al. 2015), Namibia (Grönemeyer et al. 2016), South Africa and Ghana (Puozaa et al. 2017). In our study, genospecies IV and V had a close relationship with B. guangxiense and B. arachidis, respectively, both originally isolated from Arachis hypogaea in China (Wang et al. 2013b, Li et al. 2015. To the best of our knowledge, our study is the first to have isolated strains of B. guangxiense and this recognition expands knowledge on the known symbionts of Bambara groundnut.
Physical and chemical properties of soil play an important role in influencing microbial communities, composition and functions (Zhang et al. 2011). Edaphic parameters such as pH, organic carbon (C), particle size distribution, nitrogen (N), calcium (Ca) and magnesium (Mg) have been shown to be the main determinants of legume rhizobial distribution in many geographical locations (Puozaa et al. 2019, Asfaw et al. 2020. There are reports of geographical locations with soils predominantly composed of clay and silt showing higher rhizobial diversity than in locations with larger soil particle sizes (Sessitsch et al. 2002). Other studies have also shown pH as a major determinant of bacterial diversity and distribution (Giongo et al. 2008, Puozaa et al. 2017). In the current study, generally, the soil properties of the sampling sites could have influence the distribution of both the recA clusters and genospecies. The soil pH, clay, silt and sand content, and exchangeable Al, Mn and K were among the main factors influencing the distribution of the majority of recA clusters and/or the genospecies (Figs 5 and 6). It should be emphasized that the concentration of exchangeable Al reflects soil pH and this study thus also shows the importance of soil acidity in the composition of soil microbiota (Ndungu et al. 2018, Puozaa et al. 2019.
The nodA and nifH genes are important symbiotic genes that code for nodulation and nitrogen fixation, respectively. Usually, these genes are located on symbiotic islands that most likely increases the possibility of gene transfer between different bacterial species (Haukka et al. 1998, Andrews et al. 2018. As such, the observation that some of our nodule isolates could not reinfect their homologous host may have been as a result of a loss of the symbiotic loci as was also observed in a study conducted by (Angelini et al. 2011). Furthermore, rather than showing species relationship, in our study the nodA gene appears to reveal information on geographic origin or host range. Studies by several authors have demonstrated the monophyletic origin of this gene in Bradyrhizobium (Moulin et al. 2004, Stepkowski et al. 2007, Aserse et al. 2012b, Beukes et al. 2016. Recently, 16 nodA clades designated as Clades I-XVI have been described for this genus, with the Clade III being highly diversified and cosmopolitan, including members that are widespread in Sub-Saharan Africa (SSA), Australia, the Americas and in southern and eastern Asia (Stępkowski et al. 2007, Steenkamp et al. 2008, Beukes et al. 2016. The majority of Bradyrhizobium strains isolated from SSA form part of nodA clade III and are able to establish symbiosis with both native and cultivated legumes (Steenkamp et al. 2008, Asers et al. 2012b. Accordingly, all strains detected in this study belonged to the nodA Clade III and were clustered with reference species mainly originating from Africa. It has been widely reported that the phylogenies of the symbiotic and housekeeping genes often show different evolutionary histories due to HGT (Moulin et al. 2004, Stępkowski et al. 2005, Menna and Hungria 2011.
The events of HGT result in phylogenetically distant microsymbionts carrying similar symbiotic genes (Horn et al. 2014). Our results also showed that the nodA and nifH phylogenies were largely incongruent with the combined housekeeping gene tree. For instance, strains from genospecies I were divided into three groups under the highly diverse nodA Clade III.3 (assigned as III.3a, III.3b and III.3d). Similarly, genospecies II and IX were distributed in nodA Clades III.3c-e and III.3d-f, respectively. Also, the representative test strains from genospecies VIII and IX formed a single nifH group (group 5). These results are consistent with findings of other authors and show that HGT is a common phenomenon in symbiotic genes of the genus Bradyrhizobium (Moulin et al. 2004, Koppell and Parker 2012, Huang et al. 2018. Notwithstanding these observations, the consistent clustering of B. vignae with some of our strains in both symbiotic and individual, as well as combined housekeeping gene phylogenies, reflects a possible vertical gene transfer that was also reported in previous studies (Aserse et al. 2012b, Chen et al. 2014. Furthermore, the results showed representative strains in nifH groups 1, 2 and 6 did not group closely with known reference strains that could indicate the presence of unique nifH variants for Bambara groundnut and soybean rhizobia in Ghana.

Conclusion
From the results obtained in this study, we conclude that the Bambara groundnut is nodulated by diverse and unique Bradyrhizobium genospecies that appear to be widely distributed in the soils of the northern savanna zones of Ghana. The unnamed genospecies, which potentially represent novel species, would require further characterization for valid description. The phylogenies of the housekeeping genes were generally incongruent with the symbiotic genes indicating an independent evolutionary his-tory between the housekeeping genes and the symbiotic genes. The soil factors analyzed in the study contributed to structuring the bradyrhizobial communities detected at the sampling sites.